Correlation of internal representations in feed-forward neural networks

Author

  • A Engel
Abstract

Feed-forward multilayer neural networks implementing random input–output mappings develop characteristic correlations between the activities of their hidden nodes, which are important for understanding the storage and generalization performance of the network. It is shown how these correlations can be calculated from the joint probability distribution of the aligning fields at the hidden units for an arbitrary decoder function between hidden layer and output. Explicit results are given for the parity, AND, and committee machines with arbitrary numbers of hidden nodes near saturation.

Multilayer neural networks (MLN) are powerful information processing devices. Because of their computational abilities they are the workhorses in practical applications of neural networks, and a lot of effort is devoted to a thorough understanding of their functional principles. At the same time, their theoretical analysis within the framework of statistical mechanics is much harder than that of the single-layer perceptron. It was realized from the beginning that the properties of the internal representations, defined as the activity patterns of the hidden units resulting from certain inputs, are crucial for understanding the storage and generalization abilities of MLN [1–5].

Qualitatively, the flexibility of MLN stems from the fact that the different subperceptrons between input and hidden layer can share the effort of producing the correct output. This division of labour gives rise to particular correlations between the activities of the hidden nodes. Near saturation these correlations become a characteristic feature of the decoder function between hidden units and output of the MLN under consideration and determine different aspects of its performance.
Several ad hoc approximations have been used to calculate these correlations; e.g., it was assumed that all internal representations giving the correct output (so-called legal internal representations, LIR) are equiprobable [2, 3], or that only internal representations at the decision boundary of the decoder function occur [3]. In this letter we show how the correlations between the hidden units can be calculated for an MLN of tree architecture and give explicit results for the parity (PAR), AND, and committee (COM) machines with an arbitrary number $K$ of hidden nodes near saturation.

An MLN of tree architecture is given by $N$ input nodes $\xi_{ik}$ grouped into $K$ sets of $N/K$ nodes each, $K$ hidden nodes $\tau_k$, and one output $\sigma$. The inputs $\boldsymbol{\xi}_k = \{\xi_{ik},\ i = 1, \ldots, N/K\}$ are coupled to the $k$th hidden unit by spherical couplings $\mathbf{J}_k = \{J_{ik} \in \mathbb{R},\ i = 1, \ldots, N/K,\ \mathbf{J}_k^2 = N/K\}$ according to $\tau_k = \mathrm{sgn}(\mathbf{J}_k \boldsymbol{\xi}_k)$. Each hidden node therefore has its own set of inputs (non-overlapping receptive fields). The hidden units $\tau_k$ determine the output through a fixed Boolean function $F(\{\tau_k\})$.

A set of input–output mappings $\{\boldsymbol{\xi}_k^\mu, \sigma^\mu\}$, $\mu = 1, \ldots, p$, is generated at random, where each bit is $\pm 1$ with equal probability. The couplings $\mathbf{J}_k$ are then adjusted in such a way that the MLN gives the desired output $\sigma^\mu$ for each input $\boldsymbol{\xi}_k^\mu$. This is generically possible only if $p/N = \alpha < \alpha_c$. We are interested in the correlations

$$c_n = \frac{1}{\alpha N} \sum_{\mu=1}^{\alpha N} \tau_1^\mu \tau_2^\mu \tau_3^\mu \cdots \tau_n^\mu \tag{1}$$

near saturation, i.e. for $\alpha \to \alpha_c$. From the statistical properties of the inputs it follows that for permutation-invariant Boolean functions $F(\{\tau_k\})$

$$c_n = \langle\langle \tau_{k_1} \tau_{k_2} \cdots \tau_{k_n} \rangle\rangle \tag{2}$$

where $\langle\langle \cdots \rangle\rangle$ denotes the average over the input–output pairs and $k_1, \ldots, k_n$ is any set containing $n$ different natural numbers between $1$ and $K$. The $c_n$ can be calculated from the joint probability distribution of internal representations

$$P(\tau_1, \ldots, \tau_K) = \left\langle\!\!\left\langle \frac{\int \prod_{k=1}^{K} d\mu(\mathbf{J}_k) \, \prod_{k=1}^{K} \theta(\tau_k \mathbf{J}_k \boldsymbol{\xi}_k^1) \, \prod_{\mu} \theta\big(\sigma^\mu F(\{\mathrm{sgn}(\mathbf{J}_k \boldsymbol{\xi}_k^\mu)\})\big)}{\int \prod_{k=1}^{K} d\mu(\mathbf{J}_k) \, \prod_{k=1}^{K} \prod_{\mu} \theta\big(\sigma^\mu F(\{\mathrm{sgn}(\mathbf{J}_k \boldsymbol{\xi}_k^\mu)\})\big)} \right\rangle\!\!\right\rangle \tag{3}$$

The calculation of $P(\tau_1, \ldots, \tau_K)$ parallels the determination of the local aligning field distribution for the perceptron [6, 7] (see also [8, 2, 9]). The general result within replica symmetry is

$$P(\tau_1, \ldots, \tau_K) = \left\langle\!\!\left\langle \delta_{\sigma, F(\{\tau_k\})} \int \prod_k Dt_k \, \frac{\prod_k H(Q t_k \tau_k)}{\mathrm{Tr}_{\eta_k} \prod_k H(Q t_k \eta_k)} \right\rangle\!\!\right\rangle$$

Letter to the Editor. 0305-4470/96/130323+05$19.50 © 1996 IOP Publishing Ltd
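As an illustration of the quantities defined in equations (1) and (2), the following minimal Python sketch evaluates $c_n$ under the simplest of the ad hoc approximations mentioned above, namely that all legal internal representations are equiprobable. It is not the replica calculation of this letter. The function names (`parity`, `and_machine`, `committee`, `c_n_equiprobable`) are hypothetical, the committee decoder is assumed to be a majority vote with odd $K$, and the output $\sigma = \pm 1$ is taken with equal probability as stated for the random mappings.

```python
from fractions import Fraction
from itertools import product
from math import prod


def parity(tau):
    """PAR decoder: product of the hidden activities."""
    return prod(tau)


def and_machine(tau):
    """AND decoder: +1 only if every hidden unit is +1."""
    return 1 if all(t == 1 for t in tau) else -1


def committee(tau):
    """COM decoder: majority vote (assumes odd K, so no ties occur)."""
    return 1 if sum(tau) > 0 else -1


def legal_internal_representations(decoder, K, sigma):
    """All hidden configurations tau in {-1, +1}^K with decoder(tau) == sigma."""
    return [tau for tau in product((-1, 1), repeat=K) if decoder(tau) == sigma]


def c_n_equiprobable(decoder, K, n):
    """c_n assuming equiprobable LIRs (an approximation, not the exact result).

    For each output sigma the product tau_1 * ... * tau_n is averaged
    uniformly over the LIRs of sigma; the two outputs are then weighted
    with probability 1/2 each.
    """
    total = Fraction(0)
    for sigma in (-1, 1):
        lirs = legal_internal_representations(decoder, K, sigma)
        average = Fraction(sum(prod(tau[:n]) for tau in lirs), len(lirs))
        total += Fraction(1, 2) * average
    return total
```

For example, `c_n_equiprobable(and_machine, 3, 1)` returns `Fraction(3, 7)`: the single LIR for $\sigma = +1$ contributes $1$, while the seven LIRs for $\sigma = -1$ contribute $-1/7$. Exact enumeration is only practical for small $K$, since the number of configurations grows as $2^K$.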

Similar resources

Effect of sound classification by neural networks in the recognition of human hearing

In this paper, we focus on two basic issues: (a) the classification of sound by neural networks based on frequency and sound intensity parameters, and (b) evaluating the health of different human ears as compared to those of a healthy person. Sound classification by a specific feed-forward neural network with two inputs (frequency and sound intensity) and two hidden layers is proposed. This process...

PREDICTION OF COMPRESSIVE STRENGTH AND DURABILITY OF HIGH PERFORMANCE CONCRETE BY ARTIFICIAL NEURAL NETWORKS

Neural networks have recently been widely used to model some of the human activities in many areas of civil engineering applications. In the present paper, artificial neural networks (ANN) for predicting compressive strength of cubes and durability of concrete containing metakaolin with fly ash and silica fume with fly ash are developed at the age of 3, 7, 28, 56 and 90 days. For building these...

Modeling of Resilient Modulus of Asphalt Concrete Containing Reclaimed Asphalt Pavement using Feed-Forward and Generalized Regression Neural Networks

Reclaimed asphalt pavement (RAP) is one of the waste materials that highway agencies promote for use in new construction or rehabilitation of highway pavements. Since the use of RAP can affect the resilient modulus and other structural properties of flexible pavement layers, this paper aims to employ two different artificial neural network (ANN) models for modeling and evaluating the effects of ...

Numerical treatment for nonlinear steady flow of a third grade fluid in a porous half space by neural networks optimized

In this paper, steady flow of a third-grade fluid in a porous half space has been considered. This problem is a nonlinear two-point boundary value problem (BVP) on a semi-infinite interval. The solution for this problem is given by a numerical method based on the feed-forward artificial neural network model using radial basis activation functions trained with an interior point method. ...

STRUCTURAL DAMAGE DETECTION BY MODEL UPDATING METHOD BASED ON CASCADE FEED-FORWARD NEURAL NETWORK AS AN EFFICIENT APPROXIMATION MECHANISM

Vibration-based techniques of structural damage detection using the model updating method are computationally expensive for large-scale structures. In this study, after locating precisely the eventual damage of a structure using the modal strain energy based index (MSEBI), to efficiently reduce the computational cost of model updating during the optimization process of damage severity detection, the M...

Neural Prediction of Buckling Capacity of Stiffened Cylindrical Shells

Estimation of the nonlinear buckling capacity of thin-walled shells is one of the most important aspects of structural mechanics. In this study the axial buckling loads of 132 stiffened shells were numerically calculated. The applicability of artificial neural networks (ANN) in predicting the buckling capacity of vertically stiffened shells was studied. To this end feed forward (FF) multi-layer ...

Journal title:

Volume   Issue

Pages  -

Publication date: 1996